Introduction: Corpus Description

What is your corpus, why did you choose it, and what do you think is interesting about it?

Synthwave (also called outrun, retrowave, or futuresynth) is an electronic music microgenre that is based predominantly on the music associated with action, science-fiction, and horror film soundtracks of the 1980s. Other influences are drawn from the decade’s art and video games. Synthwave musicians often espouse nostalgia for 1980s culture and attempt to capture the era’s atmosphere and celebrate it. (from Wikipedia)

I chose this corpus because I currently listen to a lot of synthwave. All my listening happens on Spotify, so I can build the corpus quickly from my own playlists. I am also working on a synthwave rhythm game in Unity, so exploring this corpus could help with that side project.

I have selected the following artists, which should represent the following subgenres:

What are the natural groups or comparison points in your corpus and what is expected between them?

I intend to divide the corpus by subgenre within synthwave. These subgenres are comparable to the subgenres within heavy metal, but the differences are likely a lot more subtle. It would therefore be interesting to see whether the subgenres are actually detectable within the corpus. I personally do not think there are very significant differences between most of them, but we will see whether the data agrees. My approach is to select five of the most different synthwave artists I enjoy listening to, and to check whether these differences are perceivable with the music-visualization methods covered during the course.

How representative are the tracks in your corpus for the groups you want to compare?

I will use a couple of playlists for each artist to build the corpus, making sure each artist takes up a (close to) equal share, so that all artists are equally represented. For subgenre detection, representativeness is debatable. It is almost universally agreed that Home is one of the OGs of vaporwave, and Jan Hammer is a very respected soundtrack artist in the scene. The other genres are probably more debatable. The same goes for the more niche subgenres of hardcore rock, however, so I believe the experiment will still be compelling.

Identify several tracks in your corpus that are either extremely typical or atypical. Why do you think that they are so typical or so strange?

Turbo Killer by Carpenter Brut is one atypical outlier. It is by far the most intense and fastest track in the entire corpus, even by Carpenter Brut standards. I also think that Resonance by Home may be an outlier. It has very odd sounds, even by vaporwave standards, and I do not think anything else sounds close to this track. Respirate (Downtown Binary Remix) is the final outlier in this corpus, because it is a remix of a song not originally created by Downtown Binary. I think it still retains the Downtown Binary style, so I expect it to be interesting in the analysis.

Source & Reproducibility

The whole project uses the Spotify API as its source. Clicking the GitHub “Source code” button shows all code used to generate the visualizations, so its accuracy and methodology can be verified.


About the playlists

The corpus playlist contains all the tracks used in the analysis. I chose to also include the old corpus, because it makes for a much better listening experience: it has a lot more variety, taking the best songs from each album, while the new corpus includes more albums in their entirety.

Corpus distribution


The basics

When working with a new dataset, it is often a great idea to create some basic plots to get a feeling for the dataset, before diving into the actual research. The following two histograms will hopefully give a crude visualization of some properties of the chosen corpus.

Low popularity

Let's start off by looking at the popularity of the songs in the corpus. The Spotify API assigns each track a popularity value from 0 to 100. We can see that most songs have a popularity of about 50, with some outliers close to the minimum and maximum. The corpus is surprisingly popular for a genre that is not well known (or maybe it is?). It would be nice if Spotify explained how it determines popularity.
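As a rough sketch of how such a histogram is put together, the snippet below sorts popularity scores (0 to 100) into bins of width 10. The scores are made-up placeholder values, not the corpus data, and `popularity_histogram` is a hypothetical helper name rather than anything from the project code.

```python
from collections import Counter


def popularity_histogram(scores, bin_width=10):
    """Count how many popularity scores (0-100) fall into each bin."""
    counts = Counter()
    for score in scores:
        # A score of exactly 100 is folded into the last bin.
        start = min(score // bin_width * bin_width, 100 - bin_width)
        counts[start] += 1
    return dict(sorted(counts.items()))


# Placeholder scores for illustration only; not the real corpus values.
example_scores = [3, 48, 52, 55, 47, 61, 50, 100]
histogram = popularity_histogram(example_scores)

# Print a crude text histogram, one row per non-empty bin.
for start, count in histogram.items():
    print(f"{start:3d}-{start + 10:3d} | {'#' * count}")
```

Bins without any tracks are simply left out, which is enough to get a first feeling for where most of the scores sit.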

Low Speechiness

It is clear from this histogram that speechiness is very low in the corpus. This makes a lot of sense, because most tracks in the corpus do not contain any vocals. What the plot does show is that the values provided by Spotify match the speechiness we would expect, which is good.

Energy & Valence


Vibe checking

Energy and valence can convey the mood of a song. We can clearly see that almost all songs have high energy, with Carpenter Brut having the highest average energy. Jan Hammer seems to have the tracks with the highest valence. The song with the lowest energy is Night Talk, with an energy score of 0.258. It is good that most songs by each artist hover around the same valence and energy, but there is a lot of overlap between artists. This means that these two features are probably not enough on their own to guess the artist of any given song.
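The per-artist averages mentioned above can be computed with a simple group-by. The following is a minimal sketch, assuming each track is available as a dictionary with `artist`, `energy` and `valence` keys. The numbers are placeholders for illustration, not the real corpus values, and `mean_features_by_artist` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean


def mean_features_by_artist(tracks):
    """Return {artist: (mean energy, mean valence)} for a list of track dicts."""
    grouped = defaultdict(list)
    for track in tracks:
        grouped[track["artist"]].append((track["energy"], track["valence"]))
    return {
        artist: (mean(e for e, _ in values), mean(v for _, v in values))
        for artist, values in grouped.items()
    }


# Placeholder values for illustration only; not taken from the corpus.
example_tracks = [
    {"artist": "Carpenter Brut", "energy": 0.9, "valence": 0.3},
    {"artist": "Carpenter Brut", "energy": 0.7, "valence": 0.5},
    {"artist": "Jan Hammer", "energy": 0.6, "valence": 0.8},
]
artist_means = mean_features_by_artist(example_tracks)
for artist, (energy, valence) in artist_means.items():
    print(f"{artist}: energy={energy:.2f}, valence={valence:.2f}")
```

Averages like these hide the spread within each artist, which is exactly where the overlap in the scatter plot comes from.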

Dynamic Time Warping


Comparing original to remaster

Jan Hammer’s most iconic soundtrack has to be ‘Crockett’s Theme’ from Miami Vice. Miami Vice is quite old now (originally aired in 1984), but Jan Hammer’s work on the soundtrack is great, and went on to inspire a lot of the modern Synthwave artists. Jan Hammer recently (2018) released a remaster of ‘Crockett’s Theme’ in the ‘Special edition’ album. I like this remaster better, but it is very subtly different from the original. Therefore it is probably the perfect candidate for this comparison.

Analysis

The plot shows that the original and the remastered soundtrack are indeed very similar.
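To give an idea of what dynamic time warping does under the hood, here is a minimal sketch of the classic algorithm on plain number sequences. This is an assumption-laden toy: the actual plot compares feature vectors per time frame, while this version uses single numbers and the absolute difference as the local cost. The sequences are invented for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances while b holds
                cost[i][j - 1],      # b advances while a holds
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# Invented toy sequences, not real audio features.
original = [1, 2, 3, 4, 3, 2]
stretched = [1, 1, 2, 3, 3, 4, 3, 2]  # same contour, held longer in places
different = [4, 4, 1, 1, 4, 4]

print(dtw_distance(original, stretched))
print(dtw_distance(original, different))
```

Two versions of the same material that only differ in timing end up with a low alignment cost, while unrelated material does not, which is why the method suits an original versus remaster comparison.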

Self Similarity Matrix: Chroma and Timbre


Song Choice

I wanted to explore the song ‘Turbo Killer’ by ‘Carpenter Brut’. This song really stands out for its quick pacing and for constantly building and escalating upon the previous ‘verse’. It gives a sense of progression and speed, which is probably why it is one of my favorite tracks in this corpus.

The Chroma matrix

When looking at the Chroma matrix, we see a lot of tiny changes in the first forty seconds, roughly one every seven seconds. After 40 seconds the song switches to high-tempo guitar only, and each of the following blocks adds additional elements to this. The matrix turns out very interesting, because you can see how the song constantly builds up to more complexity.

The Timbre matrix

The Timbre matrix looks less interesting. There are no real verses, but you can tell where transitions to more complexity happen. What is very surprising is that the high point of the timbre matrix is at the end of the song.
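For readers unfamiliar with self-similarity matrices, the sketch below shows the basic idea: every time frame is compared with every other frame, and the pairwise distances form a square matrix. It assumes cosine distance and tiny invented three-dimensional frames. Real chroma vectors have 12 dimensions, and the distance measure used for the plots may well be a different one.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm if norm else 1.0


def self_similarity_matrix(frames):
    """Compare every frame with every other frame."""
    return [[cosine_distance(a, b) for b in frames] for a in frames]


# Invented toy frames: two frames of "section A", then two of "section B".
frames = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0]]
ssm = self_similarity_matrix(frames)
for row in ssm:
    print(" ".join(f"{value:.1f}" for value in row))
```

Frames from the same section produce low-distance blocks along the diagonal, which is the block pattern you look for when reading the chroma and timbre matrices.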

Chordogram


Chord comparison

I have chosen Days of Thunder and Wild Ones from the corpus for the chordogram analysis. Both songs contain vocals and have a more ‘normal’ structure than most other synthwave tracks in the corpus. Days of Thunder seems to have less chord variation, while also being longer. Wild Ones has a lot of chord changes in the beginning, but its final stretch seems to have no significant chord changes. It is surprising that the two songs come out so different: while the tracks do not sound the same, I do think that both artists make similar kinds of synthwave music.
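A chordogram is typically built by matching each chroma frame against a set of chord templates. The sketch below illustrates that idea under simple assumptions: binary templates for the 24 major and minor triads, and a plain dot product as the matching score. The names `triad_template` and `best_chord` are hypothetical, and the chroma frames are invented.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def triad_template(root, minor=False):
    """Binary 12-bin template with the root, third and fifth switched on."""
    third = 3 if minor else 4
    template = [0.0] * 12
    for interval in (0, third, 7):
        template[(root + interval) % 12] = 1.0
    return template


# One template per major and minor triad, 24 in total.
TEMPLATES = {}
for root, name in enumerate(PITCH_CLASSES):
    TEMPLATES[name + ":maj"] = triad_template(root)
    TEMPLATES[name + ":min"] = triad_template(root, minor=True)


def best_chord(chroma):
    """Return the name of the template with the highest dot-product score."""
    def score(name):
        return sum(c * t for c, t in zip(chroma, TEMPLATES[name]))
    return max(TEMPLATES, key=score)


# Invented chroma frames: energy on A, C, E and on C, E, G respectively.
a_minor_frame = [1.0, 0, 0, 0, 0.8, 0, 0, 0.1, 0, 0.9, 0, 0]
c_major_frame = [1.0, 0, 0, 0, 0.9, 0, 0, 0.8, 0, 0, 0, 0]
print(best_chord(a_minor_frame), best_chord(c_major_frame))
```

Applying this frame by frame gives one chord label per moment in time, so long stretches with the same label correspond to the sections without significant chord changes.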

Tempogram


Tempo Analysis

Nexus by Downtown Binary gives some interesting results in the tempo analysis: it has three very distinct phases. The intro phase takes up the first 90 seconds of the song and has a strongly fluctuating tempo, which gives a very noisy tempogram. The second phase lasts until 115 seconds and has a more clearly defined BPM, as it slowly bridges the intro and the third phase. The third phase has a very clear 105 BPM line and a less clear 158 BPM line.

Hang’em all is an interesting song to compare with Nexus, because Nexus is more chill, while Carpenter Brut’s Hang’em all is more powerful and aggressive. Hang’em all contains both high- and low-intensity sections. The first 50 seconds of the track are intense, followed by a low-intensity subsection. You can clearly observe this in the tempo plot, which becomes less noisy and shows a more consistent line after these first 50 seconds. From 85 to 120 seconds there is a more intense subsection, followed by a low-intensity part that blends into high intensity, which is unexpected, because in the plot the whole section looks like the first 50 seconds of the low-intensity track section.

Comparing

Hang’em all consistently has a higher BPM than Nexus, which was to be expected. While Hang’em all does contain relaxed parts, it always feels faster paced than Nexus. In the lowest-intensity and build-up phases of both songs the BPM is not clearly defined in the plot, but when the main parts of both songs happen, the BPM line becomes very sharp and accurate. Average BPM seems to be a reasonable and usable statistic to differentiate between Carpenter Brut and Downtown Binary.
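To illustrate where a sharp BPM line comes from, here is a minimal sketch of tempo estimation by autocorrelation. Note that this is a swapped-in, simplified technique and not necessarily how the tempograms in this portfolio were computed. It assumes an onset-strength envelope sampled at a known frame rate, and the click track is synthetic.

```python
def estimate_bpm(onset_envelope, frame_rate, min_bpm=60, max_bpm=200):
    """Return the tempo whose beat period gives the strongest autocorrelation."""
    # Convert the BPM search range into beat periods measured in frames.
    min_lag = max(1, int(frame_rate * 60 / max_bpm))
    max_lag = int(frame_rate * 60 / min_bpm)
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # How well does the envelope line up with itself shifted by `lag`?
        score = sum(
            onset_envelope[i] * onset_envelope[i + lag]
            for i in range(len(onset_envelope) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return 60 * frame_rate / best_lag


# Synthetic click track: one click every 50 frames at 100 frames per second,
# which corresponds to a beat every 0.5 seconds.
clicks = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
print(estimate_bpm(clicks, frame_rate=100))
```

A steady beat produces one dominant peak, while a passage without a clear pulse spreads its score over many lags, which matches the noisy build-up phases in both tempograms. A tempogram repeats this kind of estimate over short windows of the track.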

Machine Learning using KNN


K-Nearest Neighbors

After all the previous analysis, it is finally time to see whether the research question holds up. This will be tested using k-nearest neighbors (KNN). The algorithm uses the Spotify API features we have become accustomed to in the previous analyses to predict the artist. To validate the results, we use 5-fold cross-validation. This means that we train the model on 80% of the data and validate on the remaining 20%. This is repeated until every 20% slice of the corpus has been in the validation partition once, giving a more honest, non-cherry-picked representation of the results.
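The procedure described above can be sketched in a few lines. This is a simplified illustration, not the code behind the results: it uses only two invented features per track, Euclidean distance, a majority vote among three neighbors, and generic labels `artist_a` and `artist_b`. The helper names are hypothetical.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def k_fold_accuracy(data, folds=5, k=3):
    """Each fold is used once for validation; the other folds train the model."""
    correct = 0
    for fold in range(folds):
        validation = data[fold::folds]
        training = [row for i, row in enumerate(data) if i % folds != fold]
        correct += sum(knn_predict(training, x, k) == y for x, y in validation)
    return correct / len(data)


# Invented (energy, valence) pairs with generic labels; not corpus data.
# Real data should be shuffled before it is split into folds.
toy_data = [
    ((0.90, 0.20), "artist_a"), ((0.95, 0.25), "artist_a"),
    ((0.85, 0.15), "artist_a"), ((0.92, 0.22), "artist_a"),
    ((0.88, 0.18), "artist_a"),
    ((0.40, 0.80), "artist_b"), ((0.45, 0.85), "artist_b"),
    ((0.35, 0.75), "artist_b"), ((0.42, 0.82), "artist_b"),
    ((0.38, 0.78), "artist_b"),
]
print(k_fold_accuracy(toy_data))
```

With five folds, every track is predicted exactly once by a model that never saw it during training, so the reported accuracy covers the whole corpus.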

Results

It is clear that the KNN classifier can be very accurate when predicting the artist. It achieves high accuracy on Carpenter Brut, and also performs decently on Jan Hammer and The Midnight. Downtown Binary and Home seem to be more difficult.